049011: Algorithms for Large Data Sets 1.2 Problem Definition
نویسنده
چکیده
We will continue the study of the Distinct Elements problem in the data stream model [6, 1, 3]. Its goal is to find a number of distinct elements in a stream of input elements. The length of the input stream can be huge, so storing all the input elements in memory is not an option. According to the data stream model, the input elements arrive and have to be processed serially, not allowing random access to any ”past” or ”future” elements. We will use the following notation:
منابع مشابه
Spatial Design for Knot Selection in Knot-Based Low-Rank Models
Analysis of large geostatistical data sets, usually, entail the expensive matrix computations. This problem creates challenges in implementing statistical inferences of traditional Bayesian models. In addition,researchers often face with multiple spatial data sets with complex spatial dependence structures that their analysis is difficult. This is a problem for MCMC sampling algorith...
متن کاملSolving Data Clustering Problems using Chaos Embedded Cat Swarm Optimization
In this paper, a new method is proposed for solving the data clustering problem using Cat Swarm Optimization (CSO) algorithm based on chaotic behavior. The problem of data clustering is an important section in the field of the data mining, which has always been noted by researchers and experts in data mining for its numerous applications in solving real-world problems. The CSO algorithm is one ...
متن کاملSolving Data Clustering Problems using Chaos Embedded Cat Swarm Optimization
In this paper, a new method is proposed for solving the data clustering problem using Cat Swarm Optimization (CSO) algorithm based on chaotic behavior. The problem of data clustering is an important section in the field of the data mining, which has always been noted by researchers and experts in data mining for its numerous applications in solving real-world problems. The CSO algorithm is one ...
متن کاملGeneral form of a cooperative gradual maximal covering location problem
Cooperative and gradual covering are two new methods for developing covering location models. In this paper, a cooperative maximal covering location–allocation model is developed (CMCLAP). In addition, both cooperative and gradual covering concepts are applied to the maximal covering location simultaneously (CGMCLP). Then, we develop an integrated form of a cooperative gradual maximal covering ...
متن کاملFuzzy clustering of time series data: A particle swarm optimization approach
With rapid development in information gathering technologies and access to large amounts of data, we always require methods for data analyzing and extracting useful information from large raw dataset and data mining is an important method for solving this problem. Clustering analysis as the most commonly used function of data mining, has attracted many researchers in computer science. Because o...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2005